The Picture Word Game, a nonverbal test of the
ability to use language-related symbols, was administered to 90
low-income urban students in grades 1 through 5. Ss were trained in
one class period and then tested in class size groups within 2 days.
In addition, previous scores on the Stanford Paragraph Meaning and
Vocabulary Subtests were obtained from school records. Results
indicated that the training-based procedure had advantages for
assessing the language skills of low-income students who often have
verbal deficits which inhibit them from displaying maximal competence
on traditional tests. (Appendixes contain sample copies of the
student's training booklet and of the final version of the test.)
(LH)
THE PICTURE WORD GAME: A NONVERBAL TEST OF
THE ABILITY TO USE LANGUAGE-RELATED SYMBOLS

Louise Corman and Milton Budoff
Research Institute for Educational Problems



Budoff and his associates have developed a training-based
procedure for assessing reasoning ability on nonverbal tasks
(Budoff, 1970). In this assessment strategy training assumes
a critical role, particularly for the child from a poor and/or
nonwhite background who may learn different cognitive strategies
in expressive formats other than those presumed to be available
by traditional tests. The training helps the child to narrow
the cognitive gap between his previously learned problem-solving
strategies and those implicit to the problems he must ordinarily
solve on the middle-class-biased tests he encounters. Inclusion
of training in the assessment procedure also minimizes the
artificiality of the test situation. Repeated contacts with the
materials in a context of support and teaching allow the
school-failing child to develop a sense that he can be competent.
Without this competence boost, he tends not to perform at his
best, implicitly expecting failure (Zigler, 1966).

The essence of this assessment strategy, then, is to impose
some control on the potentially negative effects on the child's
test performance of prior life experiences. Two types of effects
can be considered: those which are due to problem-solving
experiences which differ from the abstracting verbal conceptual
types of skills expected of school children, and the negative
effects of failing to perform well on the tests the child has
taken during his school years. Test scores after training
should reflect the child's ability under optimized conditions
in which he is familiar with the task and its demands, has had
success in solving problems similar to those on the test, and
has had the opportunity to learn and apply relevant strategies.

The goal of this study was to apply the rationale from
training-based assessment to tasks that are language related,
since previous efforts had focused on nonverbal reasoning tasks.
The specific aim was to develop a procedure that would permit the
low SES, low school achieving child to show his/her proficiency
in utilizing language-related symbols, following a systematic
learning experience. In this task, the child learns a new symbol
system for common words that he understands, and must apply
this understanding in generating or decoding sentences. A
critical feature of this language measure is that it not make
excessive demands on the previously developed verbal skills of
students who may have experienced difficulty with verbal tasks.






The Picture Word Game (PWG) was conceived as a modification
of the Semantic Test of Intelligence (STI) constructed by Philip
Rulon. The STI is a language-related measure in which the
examinee learns to associate a geometric symbol with a pictured
object or action (man, woman, runs, pushes). The student must
"read" the sequence of symbols and choose the appropriate picture
from a multiple choice array. The task is an analogue to reading
the words in a sentence and indicating the picture that best
represents the meaning of the sentence. As such, it was thought
that the task might represent a means by which one could examine
verbally related competencies of children in a training-based
format, without the necessity of verbal expressive materials or
reading words, skills in which these children are often deficient.

The STI is administered as a timed test and consists of
217 items, including 109 items with one symbol, 49 with two
symbols, 36 with three symbols, and 23 with four symbols. The
symbols for each noun and verb are introduced as single symbols,
and defined by accompanying pictures in a multiple choice format
on tuition pages which are not scored. Tuition pages are used
to introduce 2-, 3-, and 4-symbol "sentences." Instructions
are pantomimed. The symbols and the pictures which define their
meaning are presented on each double page so that memory for the
meaning of the symbol is not a factor influencing performance.

The STI was developed as a measure of military trainability
for illiterate recruits who had failed the literacy requirement
for entry into the Marine Corps. Validity of the test as a
language measure was evidenced by Rulon and Schweiker's (1953)
finding that recruits who did well on the STI also successfully
completed a literacy course to meet the eligibility requirements
of the Marine Corps.

In a previous study (Gimon, Budoff, & Corman, 1974), the
investigators administered the STI to 76 Spanish-speaking children
from low income families, who ranged in age from 6 to 13 years.
Results indicated that a ceiling effect occurred with children
over 8 years old. The mean number of one-symbol items passed by
over 50% of the total sample was 92 of 109 (84%), and all but
seven of the 49 two-symbol items were passed by more than half
the children. These findings revealed that the majority of items
on the test were too easy for the children in this age range.
This investigation also revealed that STI scores were
significantly correlated with vocabulary scores on both the Spanish and
English versions of the WISC. The investigators concluded from
these results that the tasks used in the STI were related to
verbal ability, but that the difficulty level of the test would
have to be broadened in order for it to be used with normal
children spanning a wider age range.
The Picture Word Game was constructed to provide a
minimally verbal measure of language ability of children in the
first through fifth grades. Development of the test was conducted
in three phases: i) construction of the original test and
accompanying training procedure, ii) test development, i.e.,
revision of the original test, and iii) evaluation of psychometric
characteristics of the final test.

Construction of the Original Test 

The following principles represent departures from the
STI format and were used with the Picture Word Game in an effort
to broaden the difficulty level of the STI:

1. The Picture Word Game is a power test, administered as
an untimed measure. Eliminating the effect of speed in performance
was considered desirable to reduce spurious inflation of
variability and consequently reliability.

2. The vocabulary in the test, i.e., the number of symbols,
was increased from nine "words" in the STI to 16 in the Picture
Word Game.

3. The concept of symbols representing numbers was included,
in addition to noun and verb representation.

4. Whereas all items in the STI require translation from
symbol to picture, approximately half the items in the Picture
Word Game require translation from picture to symbol. The
investigators believed that translation from picture to symbol
would be a more difficult task, especially when three or more
symbols were to be read. An internal measure of the effectiveness
of the teaching sequence would be the extent to which the child
can utilize the symbol in both directions: to decode the symbols
into their pictorial equivalents, and to generate sentences in
the "new" language that would explain the picture. (See
Figure 1.)

5. While the STI contained 217 items, including a large
number of very easy one and two symbol items, the original
version of the Picture Word Game contained 60 items with a much
smaller proportion of one and two symbol items. The intent was
to reduce the number of items on the final form even further so
that the Picture Word Game could be administered in one class
period.

6. While the most difficult items on the STI require
reading a "sentence" of four symbols, the Picture Word Game
included items of five symbols as well. It was not possible to
construct a meaningful sentence of more than five symbols
within this format, when these symbols represented only nouns,
verbs, and numbers.



FIGURE 1
The original form of the Picture Word Game contained 60
items, with odd-numbered items requiring translation from symbol
to picture and even-numbered items requiring translation from
picture to symbol. The test included 6 one-symbol, 12
two-symbol, 18 three-symbol, 18 four-symbol, and 6 five-symbol items.
Vocabulary comprised 16 symbols: five nouns (cat, dog, woman,
boy, horse), five intransitive verbs (sit, lie down, walk,
stand, run), three transitive verbs (pull, chase, carry), and
three numbers (two, three, and four).

A training procedure was developed such that training could
be entirely concluded before administration of the test, rather
than embedding training within the test as in the STI. Oral
instructions were used in the training, instead of relying solely
on pantomimed instructions. The latter procedure used in the
STI was considered artificial by the investigators on the basis
of a previous administration of the STI (Gimon, Budoff, & Corman,
1974).

A training booklet was devised which contained 10 of the 16
symbols used in the vocabulary of the test. A page with all
the symbols in the vocabulary was attached to the back of the
training booklet, so that it could be easily detached and referred
to by the student during the training session. The six symbols
contained in the test but excluded from training were one noun,
one intransitive verb, one transitive verb, and the three numbers.
The training booklet contained items similar to those in the
test, and slides were made of each training item. The trainer
displayed the slides while explaining the principles
of the tasks and techniques for solving them to students using
the training booklet.

Test Revision

Subjects and Procedure 

Pilot testing was conducted in June of 1973, with a sample
of 205 students from a low income school district in an urban
community in Massachusetts. The subjects constituted ten first
through fifth grade classrooms, two classes per grade, and were
evenly divided by sex. The mean age of the sample was 9 years,
2 months, with a standard deviation of 1 year, 7 months.

All students were trained in one class period and tested 
in class size groups in a second class period three days following 
training. A student's score was calculated as the percent of 
items he answered correctly of the total number of test items. 

Results 

The test showed a high degree of internal consistency
reflected in a KR20 reliability coefficient of .95. The mean
percent of correct responses for the total sample was 75.5
(SD = 20.8). The mean percent correct for students in grades
one to five, respectively, was: 62.7 (±21.3), 63.0 (±21.2),
86.9 (±11.8), 79.5 (±11.8), and 89.1 (±11.8). These figures
indicated that, despite efforts to make this test more difficult
than the STI, a ceiling effect was found for students at or
beyond the third grade. Mean scores by grade level reflected
a dichotomy between first and second graders on the one hand
and third through fifth graders on the other.
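
The KR20 coefficient reported above can be computed from a matrix of dichotomously scored responses. The following is a minimal sketch in Python, not the authors' computation: the function name `kr20` and the small response matrix are hypothetical, and the use of the population variance (dividing by the number of students) is an assumption, since the paper does not state which variance formula was used.

```python
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Proportion of students passing each item (item difficulty p_i).
    p = [sum(row[i] for row in responses) / n_students for i in range(n_items)]
    sum_pq = sum(p_i * (1.0 - p_i) for p_i in p)

    # Variance of total scores (population form; an assumption).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical data: 5 students x 4 items (not from the study).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))  # 0.8
```

With real data, each row would hold one student's 60 item scores from the pilot form.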

Selection of Items for the Final Test

Difficulty levels and discrimination indices were obtained
for each item with the total sample and subjects in each grade.
Items were selected for the final test which met three criteria:
i) the difficulty level of test items for the total sample was
evenly distributed throughout the test, i.e., approximately
equal numbers of easy and moderate-to-hard items were retained,
ii) to the extent it was possible, each item selected showed a
gradual increase in difficulty from grades one to five, and iii)
the discrimination of each item was not less than .25 for the
total sample. These procedures were used in an attempt to
produce test scores that would reflect a developmental trend
and reduce the likelihood of a ceiling effect on the final test.
The revised test consisted of 37 items, including 1 one-symbol,
7 two-symbol, 9 three-symbol, 9 four-symbol, and 11 five-symbol
items.
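
Criteria ii) and iii) above lend themselves to a simple filter over per-item statistics. The sketch below is only an illustration under stated assumptions: the class `ItemStats`, the function `meets_criteria`, and the pilot values are hypothetical, and criterion ii) is read here as "the proportion of students passing the item does not fall from one grade to the next," which is one possible interpretation of the developmental trend the authors describe. Criterion i), the balance of easy and harder items, is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ItemStats:
    item: int
    discrimination: float      # total-sample discrimination index
    p_by_grade: List[float]    # proportion correct in grades 1..5


def meets_criteria(stats: ItemStats, min_disc: float = 0.25) -> bool:
    """Apply criteria ii) and iii) to one item.

    Criterion ii) is read (an assumption) as: the proportion passing
    the item never decreases from one grade to the next.
    """
    p = stats.p_by_grade
    rising = all(a <= b for a, b in zip(p, p[1:]))
    return rising and stats.discrimination >= min_disc


# Hypothetical pilot statistics for three items (not from the study).
pilot = [
    ItemStats(1, 0.10, [0.95, 0.97, 0.99, 1.00, 1.00]),  # fails iii)
    ItemStats(2, 0.55, [0.30, 0.45, 0.70, 0.80, 0.90]),  # passes both
    ItemStats(3, 0.40, [0.50, 0.40, 0.85, 0.80, 0.90]),  # fails ii)
]
selected = [s.item for s in pilot if meets_criteria(s)]
print(selected)  # [2]
```

Because the authors applied criterion ii) only "to the extent it was possible," a strict filter like this one would probably be relaxed in practice.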

Other revisions based on pilot testing consisted of
improvements in pictures used in specific items. An effort
was made in both the test and training booklets to sharpen
pictures which the children had difficulty interpreting.
Appendix A contains the final training instructions and student's
training booklet. The final version of the test is presented
in Appendix B.

Characteristics of the Final Test

Subjects

The sample for this phase of the study, which was
conducted in January, 1974, consisted of 90 students from a low
income area in Massachusetts. Subjects were evenly divided
into five first through fifth grade classrooms, one class per
grade, and attended an elementary school in the same urban
community as the school from which the pilot sample was drawn.
Subjects were evenly divided by sex. The mean age of the sample
was 8 years, 10 months (SD = 1 year, 9 months).

Procedure 

All students were trained in one class period and tested
in class size groups within two days following training. In
addition, all students' scores on the Stanford Paragraph
Meaning and Vocabulary Subtests were obtained. The Primary II
level of this test had been administered to third through fifth
graders by the school when these students were in the third
grade, and scores were obtained from their school records.
First and second graders were group administered the Primary I
level after Picture Word testing had been completed.

Results

Discrimination and difficulty levels of final test items
are presented in Table 1. Discrimination indices of almost
all items were high, and this fact was reflected in the high
KR20 reliability coefficient of .93. Mean item difficulties
revealed that several procedures used to modify the STI were
successful in increasing the difficulty level of the Picture
Word Game when the total sample is considered: a) items
requiring translation from picture to symbol were significantly
harder than items which required translation from symbol to
picture (F = 63.64, 1/85 df, p < .001); b) the difficulty of
items increased linearly as the number of symbols increased
from two to five (F = 70.27, 3/255 df, p < .01), and c) the
two most difficult sets of items on the test were items employing
the concept of numbers, which had not been taught and all of
which had three or more symbols (mean = 59.3%), and the five
symbol items (mean = 53.4%). The STI does not include either
of these item types.
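
The subgroup means cited above follow directly from the item difficulties in Table 1. As a hedged arithmetic check, the sketch below averages the difficulty values that are legible in Table 1 for the four-symbol items (item 15 and items 19 to 26) and the five-symbol items (items 27 to 37); the variable and function names are hypothetical.

```python
# Item difficulties (proportion correct) as read from Table 1.
four_symbol = {15: .50, 19: .77, 20: .65, 21: .44, 22: .77,
               23: .63, 24: .77, 25: .68, 26: .63}
five_symbol = {27: .58, 28: .67, 29: .70, 30: .61, 31: .41, 32: .64,
               33: .38, 34: .58, 35: .43, 36: .43, 37: .44}


def mean_percent(difficulties: dict) -> float:
    """Mean item difficulty as percent correct, rounded to one decimal."""
    return round(100 * sum(difficulties.values()) / len(difficulties), 1)


print(mean_percent(four_symbol))  # 64.9
print(mean_percent(five_symbol))  # 53.4
```

The five-symbol result reproduces the 53.4% reported in the text, and both results agree with the total-sample means for four-symbol and five-symbol items in Table 2.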

Despite the effectiveness of these procedures, however,
mean total scores (in terms of percent correct) for each grade
again revealed a ceiling effect for children beyond the third
grade whose average score exceeded 80% correct (Table 2).
Total score means reflected the same dichotomy between grades
1 to 2 and 3 through 5 which was found on the pilot test, despite
item selection procedures which retained items with the most
marked linear developmental trend. The table indicates that
the gap between second and third graders widened as the number
of symbols increased from three to four. Items involving
number concepts and those requiring translation from picture
to symbol also differentiated first and second graders from
the rest of the sample.

Mean stanines on the Stanford Comprehension and Vocabulary
subtests were relatively comparable among the five grades and
revealed that these children's reading skills were in the
moderate to low range in relation to national norms. The
mean stanine for the total sample was 4.3 (±1.8) on comprehension
and 4.1 (±1.4) on vocabulary. Scores on the Picture Word Game
were correlated with scores on these two subtests to provide
evidence of validity of the Picture Word Game as a language
measure. Coefficients of .37 and .34 were obtained with
comprehension and vocabulary, respectively. While these coefficients
TABLE 1

Discrimination and Difficulty Levels of
Final Picture Word Game Items

      Number of    Stimulus     Number
Item  Symbols      Mode*        Concept?  Discrimination  Difficulty

1     1            P                      [illegible]     [illegible]
2     [illegible]  [illegible]            [illegible]     [illegible]
3     [illegible]  [illegible]            [illegible]     [illegible]
4     [illegible]  [illegible]            [illegible]     [illegible]
5     [illegible]  S                      [illegible]     [illegible]
6     [illegible]  P                      [illegible]     [illegible]
7     [illegible]  [illegible]            [illegible]     [illegible]
8     [illegible]  S                      .44             [illegible]
9     [illegible]  P                      .44             .79
10    [illegible]  S                      .46             .79
11    [illegible]  P                      [illegible]     [illegible]
12    [illegible]  S                      .57             .88
13    [illegible]  P                      .73             .80
14    [illegible]  S            yes       .56             .61
15    4            P            yes       .44             .50
16    3            S            yes       .65             .58
17    3            P            yes       .47             .63
18    3            S                      .62             .87
19    4            P                      .47             .77
20    4            S            yes       .51             .65
21    4            P            yes       .46             .44
22    4            S            yes       .72             .77
23    4            P                      .39             .63
24    4            S            yes       .74             .77
25    4            P            yes       .59             .68
26    4            P                      .62             .63
27    5            P            yes       .67             .58
28    5            S                      .59             .67
29    5            P                      .65             .70
30    5            S            yes       .74             .61
31    5            P            yes       .62             .41
32    5            S            yes       .67             .64
33    5            P                      .62             .38
34    5            S            yes       .75             .58
35    5            P                      .58             .43
36    5            [illegible]  yes       .57             .43
37    5            [illegible]            .58             .44

* Items marked "P" require translation from picture to
symbol; items marked "S" require translation from symbol to
picture.
TABLE 2

Mean Percent Correct by Grade on the Picture Word Game

                      Total                                                                               Number of
                      Sample        Grade 1       Grade 2       Grade 3       Grade 4       Grade 5       Items
                      X (SD)        X (SD)        X (SD)        X (SD)        X (SD)        X (SD)

N                     90            20            21            15            16            18

Total Test            68.8 (23.4)   45.0 (21.1)   55.2 (19.5)   80.4 (11.5)   84.5 (10.0)   87.7 (8.5)    37

Number of Symbols
  Two*                89.7 (16.2)   74.4 (23.2)   92.3 (11.9)   93.3 (9.0)    95.3 (10.7)   95.8 (5.9)    8
  Three               73.0 (26.7)   44.4 (30.2)   65.1 (23.1)   85.2 (11.2)   86.1 (8.3)    92.0 (12.2)   9
  Four                64.9 (29.2)   46.1 (32.4)   48.7 (33.2)   71.9 (9.8)    84.0 (13.6)   82.1 (14.4)   9
  Five                53.4 (35.0)   23.2 (24.5)   25.5 (26.9)   73.9 (23.0)   75.6 (21.7)   82.8 (13.2)   11

Stimulus Mode
  Picture to Symbol   63.5 (24.5)   40.3 (20.5)   47.4 (21.1)   73.7 (13.0)   81.9 (9.8)    83.0 (11.2)   19
  Symbol to Picture   74.5 (24.1)   50.0 (24.7)   63.5 (19.8)   87.4 (13.0)   87.2 (11.7)   92.6 (7.6)    18

Number Concept        59.3 (30.5)   33.0 (23.6)   37.1 (25.9)   75.6 (20.0)   77.9 (16.6)   84.1 (13.9)   15

* Item 1 with one symbol was included in this group.
are low in terms of concurrent validity, they are similar in
magnitude to the validity coefficient of [illegible] obtained by
Rulon and Schweiker (1953) with the STI.
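
A validity coefficient of the kind reported here relates each student's Picture Word Game score to the same student's subtest stanine. The following is a minimal sketch assuming a Pearson product-moment correlation, which the paper does not name explicitly; the function `pearson_r` and the six pairs of scores are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores for six students (not from the study).
pwg_percent = [45.0, 55.0, 60.0, 80.0, 85.0, 90.0]
stanine = [3, 5, 2, 4, 6, 5]
r = pearson_r(pwg_percent, stanine)
print(round(r, 2))
```

In the study, the two lists would each hold 90 values, one pair per student.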

Discussion

A training-based procedure clearly has advantages for
assessing language skills of low SES children. These children
typically have a verbal deficit which inhibits them from
displaying their maximal competence on traditional verbal tests.
Training provides a supportive context which can permit them
to demonstrate optimum performance.

The basic principles employed in the STI would appear to
lend themselves to adaptation for use as a language measure with
children of a broad age range. Although certain test
construction procedures were effective in increasing the difficulty of
the STI, attempts to broaden the difficulty level of the test
as a whole were largely unsuccessful. Like the STI, the Picture
Word Game was shown to be too easy for children beyond the
third grade, despite the fact that comprehension and vocabulary
skills of this sample were below average on national norms.

In its present form the Picture Word Game appears to be
most useful with second graders. It could be argued that the
training session was so beneficial in maximizing test performance
that training was partially responsible for the ceiling that
resulted in the test. Some might suggest that training only
items which require translation from symbol to picture might
lower the test ceiling and at the same time permit examination
of transfer of training to the more difficult picture to symbol
items on the test. It is doubtful, however, that this procedure
would effectively reduce the test ceiling; despite the fact
that items involving number concepts or five symbols were
excluded from the training, children beyond the third grade
had little difficulty in translating such items on the test.

Results of this study indicated that the Picture Word
Game measures a unitary ability which is related to language
skills. It is likely that facility in translation required
by this test is mastered by children of normal intelligence
after the third grade and that, within the STI format,
quantitative modifications are not sufficient to tap the increasingly
complex language skills learned by children in the intermediate
grades. Use of the Picture Word Game in further research with
educable mentally retarded children might prove to be quite
fruitful; it is unlikely that the ceiling effect would be
evident with these children for whom a minimally verbal,
training-based measure is clearly appropriate and could serve
a critical need.